rv8: a high performance RISC-V to x86 binary translator
نویسندگان
چکیده
Dynamic binary translation has a history of being used to ease transitions betweenCPU architectures[7], includingmicro-architectures. Modern x86 CPUs, while maintaining binary compatibility with their legacy CISC instruction set, have internal micro-architectures that resemble RISC. High performance x86micro-architectures have long used a CISC decoder front-end to crack complex instructions into smaller micro-operations. Recently macro-op fusion [17][6] has been used to combine several instructions into one micro-op. Both techniques change the shape of the ISA to match the internal μop micro-architecture. Well-known binary translators also use micro-op internal representations to provide an indirection between the source and target ISAs as this makes the addition of new instruction sets much easier. We present rv8, a high performance RISC-V simulation suite containing a RISC-V JIT (Just In Time) translation engine specifically targeting the x86-64 instruction set. Achieving the best possible mapping from a given source to target ISA pair requires a purpose designed internal representation. This paper explores a simple and fast translation from RISC-V to x86-64 that exploits knowledge of the geometries of the source and target ISAs, ABIs, and current x86-64 micro-architectures with a goal of producing a near optimal mapping from the 31 register source ISA to the 16 register target ISA. Techniques are explored that exploit the CISC encoding to increase instruction density, coalescing micro-ops into CISC instructions with the goal of generating the minimum number of micro-ops at the target micro-architectural level.
منابع مشابه
Low-Complexity Dynamic Translation in VDebug
Machine-level dynamic binary translation has been used in applications ranging from debugging, performance analysis, and security policy enforcement to full machine virtualization. Most implementations are optimized for performance rather that simplicity: they translate to an internal intermediate form before generating target code. While an intermediate form greatly assists certain types of co...
متن کاملDIGITAL FX!32: Combining Emulation and Binary Translation
Vol. 9 No. 1 1997 3 Three factors contribute to the success of a microprocessor: price, performance, and software availability. The DIGITAL FX!32 product addresses the third factor, software availability, by making hundreds of new applications available on Alpha-based platforms running the Windows NT operating system. DIGITAL FX!32 software combines emulation and binary translation to provide f...
متن کاملBinary Translation Using Peephole Superoptimizers
We present a new scheme for performing binary translation that produces code comparable to or better than existing binary translators with much less engineering effort. Instead of hand-coding the translation from one instruction set to another, our approach automatically learns translation rules using superoptimization techniques. We have implemented a PowerPC-x86 binary translator and report r...
متن کاملThe Design and Implementation Ocelot’s Dynamic Binary Translator from PTX to Multi-Core x86
Ocelot is a dynamic compilation framework designed to map the explicitly parallel PTX execution model used by NVIDIA CUDA applications onto diverse many-core architectures. Ocelot includes a dynamic binary translator from PTX to many-core processors that leverages the LLVM code generator to target x86. The binary translator is able to execute CUDA applications without recompilation and Ocelot c...
متن کاملThe Renewed Case for the Reduced Instruction Set Computer: Avoiding ISA Bloat with Macro-Op Fusion for RISC-V
This report makes the case that a well-designed Reduced Instruction Set Computer (RISC) can match, and even exceed, the performance and code density of existing commercial Complex Instruction Set Computers (CISC) while maintaining the simplicity and cost-effectiveness that underpins the original RISC goals [12]. We begin by comparing the dynamic instruction counts and dynamic instruction bytes ...
متن کامل